1 Introduction

Music and language are higher functions specific
to the human species. Thanks to its high temporal and
spatial resolution and high signal-to-noise ratio,
MEG may allow us to dissociate the components
that make up an integrated function. To
understand the neural bases underlying music
and language, we have been studying elemental
processes related to melody, musical tones, letters,
and words using MEG responses.

2 Experiments 

The experimental paradigm was based on the
presentation of auditory or visual stimuli, i.e., tones
or letter-strings (words), and the execution of a mental
task by subjects, during repeated measurement runs
for averaging. Different cognitive tasks were used,
each aiming to highlight a specific, elemental process
that is required for the task using auditory or visual
stimuli. MEG responses to the stimuli were analyzed
in terms of equivalent current dipoles (ECDs), which
may represent the spatial center of population
neural activity, in the individual brain.

Over the course of several studies, we used
different SQUID magnetometers for the MEG
recordings, including a dual 37-channel system (BTi)
at the Okazaki National Institute for Physiological
Sciences, a 19-channel homemade system in our
laboratory, and a 204-channel system (Neuromag) at
the Medical Hospital of Hokkaido University.

3 Detection of melody incongruity 

3.1 Background 

A melody is a temporal sequence of pitches or 
harmonies, which fit the scale of a selected key. 
Incongruity occurs when the pitch or harmony of 
notes deviates within a phrase, and this deviation 
elicits a late positive wave or P300 of the ERP 
(event-related potential) [1]. A sentence in a 
language is composed of sequential words that are 
syntactically organized, where incongruity is 
introduced by altering the words either semantically
or syntactically. Semantic and syntactic incongruities
elicit distinct ERP components of N400 and left
anterior negativity, respectively, with different scalp
topographies [2,3]. Here, we tried to determine the 
MEG responses that would reflect the incongruity in 
melody [4]. 

3.2 Melody stimuli and recording 

Seven right-handed subjects with normal hearing and a
mean age of 27.0 years (range: 21-34 years)
participated in the experiment. End phrases were
taken from 25 well-known songs, from which
deviant phrases with an incongruent note (IN) were
composed by modifying the keynote at the end of
the phrase to an out-of-key note in different ways (see
the score in Fig. 1). These IN phrases were mixed in
random order with the same number of phrases that
each had the original congruent endnote (CN).
MEG recordings were conducted using a 19-channel
SQUID magnetometer over the right side of the subjects'
heads in an area covering the auditory cortex, while
the subjects received the melodies in their left ear. The
subjects were instructed to count the number of IN
melodies and report it orally at intermissions during
the recording. The MEG signals were divided into
CN and IN responses and averaged over 150 phrases
for each response with reference to the onset of the
endnote tone. Single equivalent current dipole
(ECD) sources that had goodness-of-fit values of
more than 90% were estimated by calculating their
coordinates and moments in the head frame of
the subjects and were registered in their MR images.
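
The goodness-of-fit criterion can be illustrated with a minimal sketch. The text does not give the formula that was used, so the definition below (percentage of the measured field variance explained by the dipole model) is an assumption based on a common convention, and the function names are hypothetical.

```python
def goodness_of_fit(measured, modeled):
    """Percentage of measured field variance explained by the dipole model.

    Assumed definition: 100 * (1 - sum((b - b_model)^2) / sum(b^2)),
    taken over all channels at one latency.
    """
    if len(measured) != len(modeled):
        raise ValueError("channel counts differ")
    residual = sum((b - m) ** 2 for b, m in zip(measured, modeled))
    total = sum(b ** 2 for b in measured)
    if total == 0:
        raise ValueError("measured field is all zeros")
    return 100.0 * (1.0 - residual / total)


def is_reliable(measured, modeled, threshold=90.0):
    """Apply the 'more than 90%' acceptance rule described in the text."""
    return goodness_of_fit(measured, modeled) > threshold
```

Here `measured` and `modeled` would be the field values across channels at one latency; a dipole whose forward field reproduces the data exactly gives 100%.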

3.3 Results and discussion

Figure 1: Left upper: Example of a phrase from the Japanese songs used as the melody stimuli, where the last note
was modified from the keynote (upper note) to an out-of-key note (lower note). Left lower: Estimated area of
dipole sources for the first note and the out-of-key last note in the right hemisphere. Right: Superimposed
19-channel waveforms with a main N1m peak, indicated by an arrow, observed for the keynote and the out-of-key note.

It was found that the N1m response to the IN tone
was significantly larger in amplitude than that to
the CN tone (waveforms in Fig. 1). In some subjects
the N1m peak was not even clear in the CN response.
The mean amplitude across subjects of the
root-mean-squared (rms) IN response over the 19 channels
was significantly larger (p<0.01; paired t-test, n=7)
than that of the CN response during a period of
110-150 ms after the N1m peak and in later shorter
periods. There was no significant difference in the
latency [mean (SD)] of the N1m peaks in the IN
[111 (28) ms] and CN [106 (23) ms] responses.
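
The amplitude comparison above can be sketched as follows: an rms value over channels for each subject and condition, then a paired t statistic across subjects. This is only an illustration under stated assumptions. The function names are hypothetical, and the critical value is the standard two-sided t-table entry for alpha = 0.01 with 6 degrees of freedom (n = 7).

```python
import math

# Two-sided critical t value for alpha = 0.01 and df = 6 (standard t table).
T_CRIT_DF6_P01 = 3.707


def rms_across_channels(channel_values):
    """Root-mean-square of the field values over all channels at one latency."""
    return math.sqrt(sum(v * v for v in channel_values) / len(channel_values))


def paired_t(cond_a, cond_b):
    """Paired t statistic and degrees of freedom for per-subject values."""
    if len(cond_a) != len(cond_b):
        raise ValueError("conditions must have one value per subject")
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n), n - 1
```

With seven per-subject rms values for each condition, the difference would be significant at p<0.01 (two-sided) when the absolute value of the returned t exceeds `T_CRIT_DF6_P01`.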

The N1m dipole source for the IN tone was located
in the auditory cortex, at a site indistinguishable
from the location of the N1m source for the first-note
tone (Fig. 1). However, reliable ECDs were not
obtained for the CN response because of reduced 
amplitude. 

When compared with the N1m response that was
elicited by the first-note tone in the phrase, both the
IN and CN responses were attenuated in amplitude.
Thus, the enlargement of the IN response relative to the
CN response indicates partial recovery of the
reduced activity, suggesting that the IN response
reflects neural detection of the incongruity of the
melody. The latency of the N1m and the location of
the ECD sources further suggest fast processing, as
early as 100 ms, in the auditory cortex in the
superior temporal plane.

The enlarged N1m wave in the IN response is
distinct from the late positive waves, such as P300,
which are observed in the EEG response following
incongruent melody and harmony [1]. The IN
response is also distinct from the N400 wave of the
potential and its magnetic counterpart, which appear
following semantic incongruities in sentences [2,5].
In terms of latency, the IN response is close to the
anterior negativity that is elicited at about 180 ms by
syntactically incorrect sentences [3], although a
frontal source is estimated from the topography of
that negativity.

4 Auditory cortical activities in musicians

4.1 Background

Musical training in childhood is important in the 
acquisition of absolute pitch, a kind of special 
musical skill with which one can recognize or sing 
any note without musical cues. This ability has been
examined mainly by neuropsychological methods. On the
other hand, an anatomical study using magnetic 
resonance imaging (MRI) showed that musicians 
who possess absolute pitch have strong leftward 
asymmetry of their planum temporale, a posterior 
part of the auditory cortex in the temporal lobe [6]. 
This finding indicates that musicians with absolute 
pitch have a larger left planum temporale than those 
who do not possess absolute pitch. 

In a previous MEG study it was shown that the
N1m response elicited by piano tones from the
auditory cortex is about 25% greater in musicians
than in non-musicians [7]. This observation suggests
plasticity by musical training in the auditory cortex, 
though the study was restricted to the left 
hemisphere. 

In this study, we sought to find functional correlates
in the neural activity of the auditory cortex between
well-trained musicians with absolute pitch and
controls without absolute pitch, extending the MEG
examination to the left and right hemispheres [8].

4.2 Subjects and auditory stimuli 

Eleven female college students majoring in music 
(20-22 years) participated in this experiment. All 
except one of them specialized in the piano and had 
been practicing from the age of 3-5. Their 
possession of absolute pitch was confirmed by
more than 90% correct responses to 50 different
piano tones in a screening procedure, in which they
were asked to identify the tones. For comparison,
eleven female college students (19-22 years), who 
had never practiced on any instrument, participated 
in the same experiment. All 22 subjects were right- 
handed. 

Five different sounds, 4 tones and a noise burst (NB), 
were used as auditory stimuli. The tones included 
commonly known musical notes C4 (fundamental 
frequency of 263 Hz), C6 (1057 Hz), and E6 (1324 
Hz), and a pure tone C5p that has the same 
frequency as the fundamental frequency of the piano 
tone C5 (520 Hz). The NB had uniform frequencies 
between 200 and 4000 Hz. The subjects listened to a 
series of randomly presented sounds, during which 
they counted the incidence of the highest tone (E6). 
The target stimulus (E6) occurred 120 times, while
the non-target stimuli (C4, C5p, C6, and NB) were
presented 300 times each in each recording.
All stimuli were presented monaurally at 60 dB.
Waveforms of the stimuli are illustrated on the
right side of Fig. 2.

MEG measurements were carried out with a
19-channel SQUID magnetometer. The stimuli were
presented to the ear contralateral to the recorded
hemisphere, while MEG responses were measured over
the right and left sides of the head in two sessions.
The recorded MEG data were averaged separately for
each stimulus. Single ECDs were calculated for the N1m
component using the data from two recording sets,
which were obtained by shifting the center of the
recording area slightly in the anterior and posterior
directions from the auditory cortex. The calculated
ECDs were selected on the basis of goodness-of-fit
values greater than 90% and stability of the
coordinates over latencies of 10 ms.
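
The two selection criteria can be sketched as a simple filter over a time series of dipole fits. The text does not define "stability" numerically, so the sketch makes several assumptions: one fit per millisecond, stability measured as the distance from the first fit in the window, and a tolerance `max_shift_mm` that is purely hypothetical.

```python
import math


def stable_window_starts(fits, gof_min=90.0, window=10, max_shift_mm=5.0):
    """Return start latencies of windows in which every fit is acceptable.

    fits: list of (latency_ms, (x, y, z), gof_percent), sorted by latency,
          assumed to be sampled at 1 ms steps.
    A window passes when all its fits exceed gof_min and stay within
    max_shift_mm (hypothetical tolerance) of the window's first position.
    """
    starts = []
    for i in range(len(fits) - window + 1):
        segment = fits[i:i + window]
        reference = segment[0][1]
        if all(gof > gof_min and math.dist(pos, reference) <= max_shift_mm
               for _, pos, gof in segment):
            starts.append(segment[0][0])
    return starts
```

A fit sequence in which a single sample falls below the goodness-of-fit threshold removes every window that contains that sample, which is the intended conservative behavior.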

4.3 Results and discussion 

Clear N1m responses were observed in 10 of 11
(10/11) left and 8/10 right hemispheres measured for
the subjects with absolute pitch (AP subjects), while
clear N1m responses were observed in 9/10 left and
9/10 right hemispheres measured for the subjects
without absolute pitch (non-AP subjects). Figure 2
(left side) shows the mean ECD locations across
subjects in each group, where the ECDs are plotted
in coordinates on an axial plane.

When the ECD locations were compared between the two
groups, the ECDs of the AP and non-AP subjects were very
close to each other and were not significantly
different (p < 0.14, t-test) in any direction in the
right hemisphere. In the left hemisphere, however,
the ECDs of the AP subjects were located 6 mm
posterior to those of the non-AP subjects. This
separation was significant (p < 0.005) in a
two-sample t-test.
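
The group comparison above can be sketched with a two-sample t statistic on one coordinate (for example, the anterior-posterior position of each subject's ECD). The text does not say whether a pooled-variance or unequal-variance test was used; the pooled (Student) form below is an assumption, and the function name is hypothetical.

```python
import math


def two_sample_t(group_a, group_b):
    """Student's two-sample t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    mean_a = sum(group_a) / na
    mean_b = sum(group_b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in group_a)
    ss_b = sum((x - mean_b) ** 2 for x in group_b)
    df = na + nb - 2
    pooled_var = (ss_a + ss_b) / df
    t = (mean_a - mean_b) / math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return t, df
```

The returned t would then be compared with a t-table value for the given degrees of freedom to obtain the significance level.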

This result seems to correspond well to the
anatomical enlargement of the planum temporale
in the posterior direction in musicians who
have absolute pitch, as revealed by an MRI study [6].
Thus, in addition to the anatomical asymmetry, our 
study also confirmed the presence of functional 
asymmetry of neural activities in the auditory cortex 
of AP musicians. We suggest that musicians with 
absolute pitch may have distinct neural processing of 
musical tones in the left auditory cortex. 

It is commonly accepted that the left auditory cortex
lies about 1 cm posterior to the right, based on the N1m
locations obtained by MEG in normal subjects and on
the anatomical asymmetry demonstrated by Geschwind and
Levitsky [9]. However, an MEG study also reported
no inter-hemispheric difference in the N1m position
in female subjects [10], suggesting a gender
difference in the anatomical structure. Our
observations for the non-AP female subjects agree with
this: there was no significant difference in the
location of the ECDs between the left and right
hemispheres.

Regarding the dependence of ECD location on the stimulus
(Fig. 2), the ECDs for the NB were located significantly
(about 4 mm) posterior to those for the tones in
both hemispheres of the AP subjects [F(3,30) = 11.50,
p < 0.0001]. In our previous MEG study [11] using
monosyllabic speech sounds, high-frequency plosive-
and fricative-vowel speech sounds elicited posteriorly
and laterally shifted N1m activities in the left auditory
cortex. Therefore, it is inferred that complex sounds
having high frequencies tend to be processed in the
more posterior part of the auditory cortex, and that the
AP subjects, who have practiced instruments
intensively, are highly sensitive to high-frequency
components of sounds.

Figure 2: Right side: Waveforms of the non-target sound stimuli: piano tones C4 and C6, the pure tone C5p, which
has the same frequency as the fundamental frequency of C5, and the noise burst NB. Left side: Mean ECD
locations across subjects in the AP (solid symbols) and non-AP (open symbols) groups, where bars indicate
standard errors. Symbols indicate: circles=C4, triangles=C5p, squares=C6, diamonds=NB.


5 Word perception and recognition

5.1 Background

Noninvasive studies of language-related functions in
normal subjects rely mostly on measurements
using PET, fMRI, and MEG. In the study of visual
word processing, a center in which words are
specifically represented has been located in the left
medial extrastriate cortex of the occipital lobe [12].
Significant activation is also observed in the left
inferior occipitotemporal region and in the left
middle-to-superior posterior temporal region [12].
An MEG study of dyslexic and normal subjects
[13] has shown that the neural activity in the left
inferior occipitotemporal region is important in
visual word perception. In our previous MEG study
[14] using Japanese Kana-letter words, where each
Kana represents a monosyllabic reading, activities at
150-250 ms after the onset of the word presentation
were observed in the extrastriate visual cortices,
mainly in the ventral part of the occipital to
occipitotemporal regions.

In this work, we sought to delineate the neural
activities underlying the visual word processing in 
the temporal region [15], which may be related to 
perception and recognition of written words. 


5.2 Discrimination task and analysis 

Eight right-handed subjects (ages 23-33 years)
participated in the experiment. A set of visual
words, each consisting of three Japanese Kana letters, was
selected from a list of the nouns that are most
frequently used in journals and newspapers.
Single words were presented in the central visual
field, centered on the fixation point. The subjects were
instructed to discriminate whether the word was
concrete or abstract, and to reply by pressing one of
two mouse keys with the fingers of the left hand. Immediately
after the key-press, the word was replaced with
a random dot pattern of the same luminance.

MEG signals were measured over the right and left
temporal areas using a dual 37-channel SQUID system.
The recorded MEG signals were averaged with reference to the
stimulus onset. Localization of single ECD sources
was applied to the main peak components in the
averaged response. The obtained ECDs were
assessed in terms of reliability and stability using the
criteria described elsewhere [14], and those reliable
and stable sources were selected. In all subjects,
T1-weighted MR tomographic images were obtained.
The MR coordinates were transformed into the head
coordinates, and the locations of the selected ECDs
were registered in the MR images.






5.3 Results 

The first large peak component was observed
at latencies of 240-300 ms from the word onset.
Although the MEG sensors were positioned to record
mainly responses from the temporal area, ECD
sources in the left occipitotemporal region (upper
photographs in Fig. 3) were localized from some of
the averaged data. This activity seems to correspond
to the left occipitotemporal activity obtained in the
previous study using single words, in which the
measurement areas covered the occipital visual
cortices [14].

Figure 3: ECD sources, with the direction of current
indicated by bars, obtained for the discrimination
task of single words. The ECDs are plotted on
representative MR tomographic images of different
temporal regions of the left hemisphere.


In the subsequent latencies, ECD sources were
found in relatively restricted temporal regions,
which were tentatively separated into medial and
lateral parts based on anatomical landmarks
(middle and lower photographs in Fig. 3). The
medial temporal region includes the
parahippocampal gyrus (para-HCG) in the inferior part
and the temporal insula (TP-IS). The lateral region
includes the posterior part of the superior temporal
sulcus (STS) and the area from the superior temporal
gyrus (STG) to the lower bank of the Sylvian fissure.
These activities in multiple areas occurred in two
distinct latency periods, at 290-390 and
420-520 ms.

In the right hemisphere, the amplitude of the responses
was generally lower than that of the left hemisphere
responses. For the first response at 260-300 ms,
reliable ECDs were not localized because the
recording area was insufficient. In the response at
300-360 ms, distributed sources were observed in areas
homologous to those of the left hemisphere. In the late
activity occurring at 420-520 ms, sources were
concentrated in an area around the Sylvian fissure
that is homologous, but slightly anterior, to the
ECDs in the left lateral temporal region.

5.4 Discussion 

The first responses in the occipitotemporal region
are compatible with our previous study, in which a word
categorization task was used [14]. Those activities
originated in the occipital to occipitotemporal higher
visual cortices, including the fusiform gyrus. They
were also left-hemisphere dominant and are thought
to take part in the ventral visual pathway that
conveys information about objects. These responses
terminated before 300 ms.

The subsequent activities after 290 ms in the multiple
temporal regions were commonly observed in the
two response periods of 290-390 and 420-520 ms. In
some previous imaging studies using PET and fMRI
in normal subjects, Talairach coordinates are
presented for the areas that are activated during
word-reading tasks. Those areas overlap
with our present results in Fig. 3, including the
inferior medial temporal region near the para-HCG [16]
and the area from the STS to the STG [16,17]. Although
covert reading is common to the tasks in those studies,
different modular processes, semantic (lexical/
semantic association) and phonological (coding), are
suggested.

In clinical studies on brain dysfunction, however,
disorders of phonological and semantic processes
are related to lesions in distinct left temporal areas
[18]. Anomic aphasia is associated with lesions in
the inferior-medial temporal area, and selective
anomia, i.e., a specific deficit in naming, is induced
by electric stimulation of the inferior temporal gyrus
[19]. Further, a PET study on word retrieval in
normal subjects and neurological patients suggested that the
inferior temporal region serves an intermediary role
between the word concept memory in the association
cortex and the phonological word formation in the
peri-sylvian language area [20].

In the previous auditory MEG study [11], we used
representative monosyllabic speech sounds of /a/,
/ka/, /ha/, and /na/ as the stimuli, which elicited a
prominent N1m response. The localized N1m ECDs
were concentrated in an area in the auditory cortex,
which is indicated by a white box in Fig. 3 as the
range of mean ± SD across six subjects. It is clear
that this area in the auditory cortex is close to, and
partly overlaps with, the lateral temporal ECDs
observed in this study. The dipole current of the
ECDs, especially those in the response at 420-520
ms, points in the inferior-posterior direction, in agreement with
that of the auditory N1m source. Further, in the
homologous peri-sylvian part of the right temporal
lobe, the ECD location of the 420-520 ms response was
slightly anterior to the left activity. This relation is
compatible with the typical left-right asymmetry of
auditory cortical activities. Such coincidences
strongly suggest a phonological role for the
functions mediated in the left lateral superior region.
On the other hand, the activities in the medial
temporal region around the para-HCG may, in light of
the above clinical studies of inferior
temporal lesions, mediate access to
word semantics.

In reading words, comprehension of their meaning
and encoding into phonology may take place
simultaneously. In the case of familiar (frequently
used) words, it is believed that comprehension
comes first. In this study we used familiar words as
stimuli, which were presented in Kana letters. In
books and newspapers, those words are written not only
in Kana (phonogram) but also in Kanji
(morphogram). Therefore, depending on the usual
written form of the words, parallel processes
may occur in slightly different manners, in which
either comprehension or phonology
comes first. The observation of common activities in
multiple temporal cortices over two response periods
may reflect such parallel processing in
word reading. These activity periods
span 300-500 ms after word presentation,
which is compatible with the mean reaction time
(most frequent RT) of about 700 ms in
discriminating the words as abstract or concrete.


6 Word composition 

6.1 Background 

On the basis of neuropsychological and clinical
studies on brain lesions that result in specific
linguistic dysfunction, it is believed that written
words are analyzed through distinct modules, such
as lexical, phonological, semantic, and syntactic
analysis. Noninvasive neuroimaging studies [12-17]
on reading words in normal subjects have often
indicated conflicting modular functions for
anatomically identical regions or areas in the brain.
As described in the preceding section, left inferior-
to-superior temporal areas are deeply involved, but
phonological and semantic analyses are not well
distinguished. Here, we restrict the phonological
function performed in the temporal region to
acoustic phonology, i.e., encoding visual words into
a form similar to auditory words, to separate it from
the articulatory phonology that functions in the
frontal region for verbal output [12].

Acoustic phonology includes phonological
working memory, which may be in operation when
a sentence is analyzed word by word, keeping the
previously encoded words in temporary memory and
utilizing them online. In the present study, we
focused on the phonological process that would take
place in the left temporal region. To place demands on
acoustic phonology, we devised a
word-composing task using three-Kana letter-strings.
The task required manipulation of encoded letters,
i.e., phonological working memory. In this respect,
the results of our task can be compared with previous
imaging studies in which visual letters or words were
rhymed [12, 21].

6.2 Composition task and recording 

Eight right-handed subjects (ages 21-34 years)
participated in the experiment. Meaningless
letter-strings consisting of three Japanese Kana letters
were made from noun words by changing the
order of the letters. These nonword letter-strings were
visually presented to the subjects, who were instructed
to compose a word by changing the order of the
letters phonologically and covertly, but not
graphically (Fig. 4). To facilitate the phonological
processing, the visual presentation was limited to the
shortest time (0.20-0.25 sec) that each subject
needed to read. Further, the subjects practiced, first
overtly and then covertly, before the MEG measurement.
During the overt practice, the performance of
composing correct words was monitored. The overt/
covert composing was continued until the subjects could
perform the task fluently. A set of nonwords different
from those in the MEG measurement was used
for the practice.

Figure 4: Visual presentation of three Kana letters
(vertically, in the traditional writing style). Subjects
reorder the letters covertly and press a button when
they find a word.

MEG responses were measured with a 204-channel
whole-head SQUID magnetometer. During the
measurement, subjects responded with a key-press
immediately after composing a word. Thus, the
reaction time (RT) was recorded simultaneously.
After the measurement, the responses were averaged
over 200 trials and filtered to 0.3-20 Hz, and the dc
offset was removed. Localization of single and
multiple (2-3 sources) ECDs was conducted using a
selected subset of sensors for the single ECDs and
these subsets of sensors for the multiple ECDs. The
selection of the calculated ECDs and registration on
the individual MR images were basically the same
as in the preceding section.
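
Two of the preprocessing steps described above, trial averaging and dc-offset removal, can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not say how the dc offset was estimated, so subtracting the mean of an initial baseline segment is an assumption, the function names are hypothetical, and the 0.3-20 Hz filter is not implemented here.

```python
def average_trials(trials):
    """Average stimulus-locked trials sample by sample for one channel.

    trials: list of equal-length lists, each aligned to stimulus onset.
    """
    n = len(trials)
    return [sum(samples) / n for samples in zip(*trials)]


def remove_dc_offset(signal, baseline_samples):
    """Subtract the mean of the first baseline_samples points (assumed method)."""
    offset = sum(signal[:baseline_samples]) / baseline_samples
    return [s - offset for s in signal]
```

In the experiment described here, `trials` would hold the 200 stimulus-locked epochs of a single sensor.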

6.3 Results and discussion 

The RTs of all subjects mostly exceeded 0.7 sec after
the stimulus onset, and the most frequent RT
averaged across subjects was in the range of 1.1-1.2
sec. We describe here MEG components at latencies
before 0.80 sec, which may not be affected by the
motor activities associated with the key-press. Taking
into account the mean RT of 0.7 sec in word
discrimination in the preceding study and the
reading time of the nonwords of 0.2 sec, the
observed RT suggests that roughly one to two
mental readings, i.e., covert reorderings of the letters,
took place, in the majority of trials, before a word
was composed.

Bilateral responses were observed in all subjects at
latencies of 0.10-0.80 sec, where 2 to 8 reliable
ECDs were obtained in each hemisphere. Initial
activities from about 0.1 sec were observed in the
occipital cortex, mostly in the extrastriate visual
areas in both hemispheres. After 0.20 sec, activities
started in the posterior superior temporal cortex from
the STS to the Sylvian fissure in both hemispheres.
These areas are commonly observed for a variety of
language-related tasks [12, 16, 17].

Figure 5 illustrates the summarized results of the 
localized ECDs that were not observed in word 
discrimination (see also Fig. 3). The new activity 
areas include the left parietal/temporal junction 
around the supramarginal gyrus and the left anterior 
temporal cortex. These activities lasted for long
periods, up to 0.8 sec, which is close to the shortest RT
of word composition.

Compared with previous imaging studies, the
activity in the parietal/temporal junction is in
agreement with the activation in the supramarginal
gyrus found in a letter rhyming task [21], for which a
phonological store of visual letters is suggested. This
notion is compatible with the hypothesized
manipulation of phonology, or phonological
working memory, in our composition task. Only some
of the subjects (3 of 8) showed
activity in the anterior temporal cortex, for which
higher-order processes of semantic to linguistic
analysis are suggested in previous clinical and
imaging studies [16, 21].

Figure 5: ECD activities in the left hemisphere that
were specifically observed in word composition, but
not in word discrimination.